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3D Extinction Mapping Using Hierarchical Bayesian Models 
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ABSTRACT 

The Galaxy and the stars in it form a hierarchical system, such that the properties of 
individual stars are influenced by those of the Galaxy. Here, an approach is described which 
uses hierarchical Bayesian models to simultaneously and empirically determine the mean 
distance-extinction relationship for a sightline and the properties of stars which populate it. 
By exploiting the hierarchical nature of the problem, the method described is able to achieve 
significantly improved precision and accuracy with respect to previous 3D extinction mapping 
techniques. 

This method is not tied to any individual survey and could be applied to any observations, 
or combination of observations available. Furthermore, it is extendible and, in addition, could 
be employed to study Galactic structure as well as factors such as the initial mass function 
and star formation history in the Galaxy. 
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1 INTRODUCTION 

The Milky Way Galaxy occupies a unique role in astronomy. As we 
are located within it, we are able to observe and analyse it and its 
constituents in a manner not possible with any other galaxy. How- 
ever, this also means that we lack a global view of it. Thus, in order 
to analyse the Galaxy's structure and history we are forced to infer 
distances to stars. A task considerably complicated by the require- 
ment to disentangle the effects of interstellar extinction. 

Recent years have been characterised by a growth in sur- 
vey astronomy. Wide area photo metric surveys suc h as 2MASS 
dSkrutskieet"al] |2006) and SDSS dYork et alj|200Ch have revolu- 
tionised astronomy on all scales, from brown dwarfs to cosmology. 
Meanwhile, there has also been significant development in surveys 
specifically studying the Galactic Plane, which is home to the ma- 
jority of stars in the Galaxy. Such Galactic plane surveys include 
IPHAS (I NT/WFC Photome tric Ha Survey of the Northern Galac- 
tic Plane, iDrew et alj|2005l) , UVEX (The UV -Excess Survey of 
the Northern Galactic Plane, iGroot et al.ll2009h and the imminent 
VPHAS+ in the optical, which will collectively provide narrow 
and broadband photometry for the entire Galactic disc (\b\ < 5). 
Meanwhile, in the near infrared UKIDSS -GPS (UKIRT Inf rared 
Deep Sky Survey - Galactic Plane Survey, Lucas et al. 2008) and 
VVV (Vista Variables in the Via Lactea, [Minniti et al.. l201Ch will, 
between them, cover much of the Galactic plane. The photometric 
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surveys ha ve been accompan ied by spectros copic surveys, incl ud- 
ing RAVE dSiebert et alj|201 lh and SEGUE dYannv et alj|2009h . 

All these surveys have massively increased the volume of data 
available on our Galaxy and, with the right analytical tools, should 
allow us to develop an improved understanding of the Galaxy. This 
trend for the growth in survey astronomy is set to continue in the 
next decade; future facilities such as Gaia and LSST will gather 
data at a hitherto unseen rate. As such, an ongoing problem is how 
best to harne ss that data in order to extract as much information as 
possible (e.g. Binney 201 lh . 

Ijuric et al] (2008) used stars from SDSS to study the stel- 
lar number density distribution at high Galactic latitudes (mostly 
\b\ > 25°, determining large scale properties of the Galaxy and 
discovering localis ed over densities in the Galactic stellar halo. 
Ilvezic et"al] d2008h follo wed this by a l so an alysing the metallic- 
ity of stars from SDSS. ICarollo et all d2010l) . again using SDSS 
data, claim that they have discovered a second halo co mponent. 
Though, as argued by Schonrich, Asplund & Casagrand^ d201 lh . it 
appears that this second component may be an artefact stemming 
fr om a substantial bias present in the distance estimates employed 
bv lCarolloetai]d2010h . 

Studies, such as those discussed in the previous paragraph, 
demonstrate both the potential rewards and possible pitfalls asso- 
ciated with using large survey datasets. Similarly, it is becoming 
increasingly clear that simple methods of inference are potentially 
both ineffective and inaccurate. In large survey datasets various pa- 
rameters influence data in a manner which is not directly obvious 
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and different parameters will have conflicting effects on the obser- 
vations. Consider the example of using star counts to estimate the 
scale length of the Galactic thin disc: a simple minded approach 
might consist of estimating the distance of the stars by 'photometric 
parallax' and then measuring scale lengths from the resulting dis- 
tribution. In this case, the individual distance estimates may well be 
biased and the sample will become incomplete and contaminated in 
a way which is not known. In short there are a number of confound- 
ing factors which prevent the scale length being accurately inferred 
through simple 'inversion'. 

Interstellar extinction is a particular hindrance when studying 
the Galaxy. It builds up gradually, along all lines of sight, in a man- 
ner which is not yet well known. A particular realisation of the 
difficulties interstellar extinction causes is the degeneracy between 
interstellar reddening and spectral type: an apparently red star may 
either be an intrinsically red late type star subject to little extinction 
or a heavily extinguished, intrinsically blue early-type star. There- 
fore, the presence of extinction makes it difficult to determine the 
spectral type and thus distance of stars, considerably complicating 
the study of structure in the Galaxy. 

Furthermore, the 3D distribution of extinction is itself intrin- 
sically interesting. Mapping extinction allows one to trace the dis- 
tribution of dust in the Galaxy. Interstellar dust is itself only one 
component of the ISM, examining its distribution and compar- 
ing it to tr acers of other components of the the ISM, such as CO 
maps (e.g. iDame, Hartmarm & Thaddeus 2001) or HI maps (e.g. 
iKalberla et al.ll2005t) . offers a window onto the ISM and the physi- 
cal processes which shape it. 

Juric et al. I d2008h were able to correct for the effects of extinc- 
tion on their observed stars using the asympto tic Galactic extinction 
map of ISchlegel, Finkbeiner & Davisl dl998h - This approach was 
reasonable as the majority of stars in their sample lie at high Galac- 
tic latitudes, beyond the Galactic dust layer, which is localised near 
the Galactic plane. However, such an approach is not valid for the 
majority of stars in the Galaxy which lie close to the Galactic plane 
and are thus unlikely to be beyond all (or nearly all) the Galactic 
dust in their direction. 

In light of the difficulties interstellar extinction causes and 
the intrinsic value of studying extinction, there has be en an on- 
oing effo rt to map it over ma ny years, going back to Trumpler 
1930) and lvan de Kamd < fl930h noticing the propensity for extinc- 
tion to be stronger near the Galactic plane. Detai led discuss i ons of 
the hi story of extinction mapping are availa ble in ISale et al.l J2009I) 
and Majewski, Zasowski & Nidever J20T lj). Recently, spurred on 
in part by the imminent launch of Gaia and construction of 
other telescopes, and the availability of high quality data, there 
has been a renewed focus on extinction mapping 1 Bailer- Jonesl 
2011; Majewski, Zasowski & Nidevedl201ll ; ISchlaflv et alj|20ld : 



Schlaflv & FinkbeineifeOllI) . 

Most recently iBerrv et af. I l20ll) have used SDSS and 
2MASS photometry to estimate the spectral type, distance and ex- 
tinction towards 73 million stars and so create extinction maps with 
fine angular scales. In common with many previous efforts to study 
extinction, they consider stars in isolation, fitting the parameters of 
a star without reference to the other stars local to it. In doing so 
they do not take advantage of the fact that properties of stars are 
correlated. For example, we know that stars located close to each 
other should be subject to similar extinctions: trivially we would be 
surprised if two stars closely located on the sky and apparently at 
similar distances exhibited the effects of vastly differing amounts 
of extinction. In not employing a method that utilises this informa- 



tion, their results are less precise than what could in principal be 
achieved with their data. 

A cont rasting approach is t o use Galactic models to study 
extinction (Marshall et al. 120061) and Galacti c structure (e.) 
iRobin, Creze & Moharjll992l : lRuphv et al.ll 19961 : ISale et al.ll201C 
this offers several distinct advantages. Specifically, analysing stars 
en masse makes it possible to exploit the physical relationships that 
exist between stars. Moreover, by comparing Galactic models to 
real observations, effects such as Malmquist bias, sample contami- 
nation and sample incompleteness need not impact upon any infer- 
ences, as they will occur in a properly constructed Galactic model 
in a similar manner to how they occur in reality. However, such ap- 
proaches typically involve binning the data in some manner. In do- 
ing so there is some inevitable loss of information, such that results 
obtained will be less precise than those obtained without binning. 
Additionally, such techniques marginalise over the parameters of 
individual stars, which may be undesirable if these are of interest. 



1.1 Hierarchical Bayesian models 

Hierarchies pervade almost all aspects of Galactic astronomy. Triv- 
ially, the Galaxy is comprised of various stellar (e.g the thin and 
thick discs) and non stellar (e.g. dark matter halo, ISM) compo- 
nents. These can be broken down into further subcomponents, stars 
in the stellar case, such that the properties of the stars are depen- 
dant on the properties of the stellar component from which they are 
drawn. Other simple examples of hierarchies include: the distribu- 
tion of the masses of stars in a newly formed star cluster which are 
drawn from some initial mass function (IMF) and the kinematics 
of stars in the Galaxy which are influenced by some global gravita- 
tional potential field. 

Extinction mapping is a directly hierarchical problem: stars 
within a field trace a distance-extinction relationship for that sight- 
line. However, existing methods of 3D extinction mapping do not 
exploit thi s hierarchical nature. For example, in methods such 
as tho se of iNeckel & Klarel dl980h and lArenou. Grenon & Gomezl 
dl992l) . the distance and extinction of each star are first determined, 
without reference to other stars. Subsequently, the final distance- 
extinction relationship was estimated by fitting to the determined 
distance and extinction values for each star. Alternatively, a prop- 
erly constructed method could use information gained from the 
field as a whole (i.e distance-extinction relationship) to help con- 
strain the estimates of parameters of individual stars (i.e. their dis- 
tance and extinction), which in turn can be used to refine our knowl- 
edge of the field. Using this form of 'group knowledge', exploiting 
the correlations which exist between stars, enables parameters to be 
determined more preci sely than is otherwise possible. Therefore, 
INeckel & fflarej dl980h and lArenou, Grenon & Gomezl dl992l) . in 
common with lBerrv et alj ( 1201 II) . obtain results less precise than 
their data are capable of. 

Hierarchical Bayesian models are rich statistical models, they 
extend upon simple Bayesian models by allowing some parame- 
ters in the model to be dependant on other parameters. One can, 
in general, solve for all the parameters in the model, both those at 
a 'lower' level, which may pertain to stars, as well as those at a 
'higher' level which could describe a sightline or the Galaxy. 

On a purely mathematical level, if we have some observation 
z and wish to estimate some parameter 9 we can employ Bayes' 
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theorem: 



P(Q\z) 



P(e|z)=*P(z|e)P(9) 



(1) 

(2) 



It is also possible to define a hierarchical model, whereby 6 itself 
depends on a further parameter, often referred to as a hyperparam- 
eter, (]). Working from conditional probabilities one can obtain: 



P(6,$ 
Also: 
P(®4 



■- p(e,<Sf\z)p(z) 



,z)=p(z\e,®p(e,® 
= p(z\QA)Pm)PW 

Therefore: 



(3) 

(4) 
(5) 



p(e,*|z)«p(z|e,4t)p(e|<»)p(iM (6) 

If z is not directly conditional on (]), this then becomes: 

p(e,0|z)ocp( z |e)p(e|0)p(0) (7) 

One could add a further tier to the hierarchy by introducing 
a further parameter on which § depends, this can be repeated as 
necessary. Also, it is possible to replace the single parameters or 
observations z, 6 and <]) with sets of several parameters. 

It is instructive to consider a simple hierarchical model of two 
tiers: a top tier containing hyperparameters describing a galaxy and 
a lower tier of parameters describing the stars which inhabit the 
galaxy, of which we possess some observations. The form of hi- 
erarchical Bayesian model given by equation [7] can be employed 
because any observations of stars we possess are not directly de- 
pendant on the state of the Galaxy. The power of the hierarchi- 
cal model in this instance is clear, it allows information about the 
galaxy to be estimated directly from the data. That is to say, we can 
solve for the posterior distribution and from that we possess esti- 
mates of certain galaxy-wide parameters that would otherwise be 
difficult or impossible to determine. 



2 MAPPING EXTINCTION IN 3D 

This section describes an algorithm, called H-MEAD (Hierarchi- 
cally Mapping Extinction Against Distance), which seeks to map 
extinction along a sightline and determine the properties of stars 
in this direction. On one level, H-MEAD operates on the idea that 
extinction can be mapped by following the effect it has on stars. 
As such, a Hierarchical Bayesian model is constructed which de- 
scribes the relationship between the observations of stars within a 
field, the parameters of these stars and the extinction distance rela- 
tionship which the stars follow. Then, by solving for the posterior 
distribution it is possible to estimate the parameters of all the stars 
and the distance-extinction relationship. 



2.1 Without interstellar extinction 

First let us start with a simple case, considering an individual star in 
the (assumed) absence of interstellar extinction. Lety,- represent the 
observations we possess for this star. The star has been drawn from 
a catalogue of stars within some field and to distinguish it from 
others in the catalogue it is labelled with i, the need to do so will 
become clear in section |23l The observations could be of any form: 
photometry, astrometry or spectroscopy. However, in this paper 





Figure 1. A pair of directed acyclic graphs (or Bayesian networks) depict- 
ing the dependence of the observable parameters of a star, y t on its physical 
parameters, x;. The two graphs are equivalent, on the left Xj is expanded 
into its component members. These are graphical depictions of the model 
expressed by equation|9] 



only the case of photometric observations will be considered in de- 
tail. In the case of multiband photometry, J> ; = {w\y^\y^\ ■■■}, 
where the w' represent magnitudes in different bands or repeated 
observations. For example, in the case of first year VVV data, with 
one epoch of ZYJH photometry and six epochs in Ks, we would 

have?,- = {Z,Y,J,H,K { s l) ,kP ,#| 3) ,K { s 4 \kP ,£ s (6) }. ForlPHAS 
data, where (almost) every object has been observed twice in its 
three filters, y t = {/( 1 ),/'( 1 ),Ha( 1 ),r'( 2 ), ; '( 2 ),Ha( 2 '}. 

yi is an estimate, with errors, of the true observable parame- 
ters of a star, y ( . Returning to the example of catalogue photometry, 
y ; contains the actual apparent magnitudes of the star. After ob- 
serving the star we then possess an estimate of the star's apparent 
magnitudes, with errors, y t in our catalogue, possibly with some 
observations missing. 

One additional factor that must be included in the likelihood 
is the probability that a star would be included in the catalogue of 
stars studied. That a star makes it into a sample may be a useful 
piece of information, depending on how the sample was compiled. 
Here the event of a given observation of a star being included in the 
sample or not is given by WS; (where j would iterate over different 
bands when employing multiband photometry), these are gathered 
together for each star in the set Sj. 

Then, let contain the physical parameters of the star, in 
which ever way we choose to parametrize them. A simple form 
would be: X[ = {f7W/,T;, [Fe/H] ,-,/*,■}. Where: 3W; is the star's initial 
mass, %j its age and m is the star's distance modulus. Other forms 
could involve substituting effective temperature and surface gravity 
for mass and age. 

We aim to find the posterior probability distribution of the stel- 
lar parameters jc,-, given the observations y r Following from Bayes' 
theorem: 

P{y h Si\xi)P{xi) 



(8) 



p{yi,Si) 

As is convention, this equation can simplified, as the evidence, 
P(yi,Sj), is constant for a single model, such that: 

P{xt$ t ,Si)<*PM]xi)P{xi) (9) 

A graphical description of this equation is given in Fig [T] 
demonstrating how the observations y t depend on the star's physi- 
cal parameters as contained in x v 

2.1.1 Likelihood 

We first decompose the likelihood as follows: 

P{y h Si\xi) =P{S i \y i ,x i )P{y i \x i ) (10) 
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For many purposes it is sufficient to compile samples of stars from 
survey photo metry, perhaps applying colour or magnitude cuts. As 
discussed bv lBurnett & Binnevl feOlOl) . the probability of a star be- 
ing in such a sample is purely a function of its observed photometry 
and therefore constant for different jc,-, but identical y,. Or more suc- 
cinctly Sj is directly dependant on y ; only. As such, it is possible to 
say: 

P{S i \y i ,x i )=P{Sj\y i ) (11) 

Continuing to concentrate on the case of multiband photome- 
try, a given set of stellar parameters x-, implies that the star i has an 
actual (not observed) apparent magnitude, ^'y,, in a given band, j; 

W>i(x,-) =W M(<*4t;, [Fe/H],-) + w (12) 

Where WM is the star's absolute magnitude in that band. Currently, 
H-ME AD obtains these from the Pa dova library of isochrones 
( Marigo etalj|2008t iGirardi et alJboiCh . which is, where neces- 
sary, converted into t he relevant filter s ystem using model spectra 
(specifically those of Munari et al. 2003). 

If it is assumed that the observed apparent magnitudes are nor- 
mally distributed around their actual value, it is then possible to say 
that: 

-(Wyi- U) yi{Xi))\ 



P( U) yi\xi) = 



2ji0')8(*;) 2 



= exp( 



2W8(x,; 



(13) 



Where W8(jCj) is the uncertainty on the apparent magnitude, in the 
band denoted by j, of a star that has the stellar parameters jt; and an 
apparent magnitude ^yj(xj). In practice this is curr ently estimated 
using a similar technique to iMarshall et alj d2006h and I Sale et all 
( 120100 . In short, W8(xj) is determined from: 

Wgfc) = A + ex V (B^ yi (xi) +C) (14) 

Where A, B and C have been determined by fitting to the estimated 
photometric uncertainties of objects within the field, produced dur- 
ing pipeline processing of the observations. 
Thus, for observations we have obtained: 

P{ U) Si\ U) yi)P( U) 



'ym) 



As the first term depends only on an observation, this is constant 
and so: 



exp( 



(16) 

Where N(xi) is a normalisation factor. 

However, in real catalogues of data individual stars may be 
missing one or more observations which other stars may possess. 
For example, in a photometric catalogue a star may lack an obser- 
vation in one band because it was too faint to be observed in that 
band. In this case will record the absence of an observation. 
We would like to repeat the process of the previous few paragraphs, 
but are unable to do so as we do not know what the unobserved ap- 
parent magnitude of the star should be. Therefore, we marginalise 
over the missing observation 



P(Wy , i Ws i \x i )d^y' i 



fp^S^y'dP^y'iMd^y'i 
J /^n^ ? eXP 



2W5(x,-) 2 ' y ' 



(17) 
(18) 

(19) 



In the middle of the catalogue's magnitude range P{ U) Si\^fi) will 
be roughly 0, whilst far outside (very bright or very faint) it will 
be almost 1. At the bright end it is normally possible to approxi- 
mate P(V>Si\(j>y'j) as a step function, thus for bright stars equation 
[19] can be approximated with the normal cumulative density func- 
tion. However, at the faint end, the form of P(v> Sil^'y^) is better 
described with a sigmoid function, the integral in equation[T9lmust 
then be solved numerically. 

Furthermore, if it is assumed that all members of y, are inde- 
pendent, as would be reasonable for photometry, but not for cer- 
tain forms of spectroscopically derived measurements based on the 
same spectrum, the likelihood can be written as: 



(20) 



Where k iterates over observations we have and / those that are 
missing for each star. 

2.1.2 Prior 

There are several contributions to the prior. For mass we can take 
the IMF as a prior, in particular here a Scalo-li ke IMF is assumed, 
as it is more suitable for field populations ( Kroupa & Weidned 
2003). For age, one could assume the star formation rate, which 
has been taken to be constant. A radial metallicity gradient can be 
included in the prior on metallicity, whilst the prior on distance can 
account both for the radial density gradient of the disc and the ge- 
ometry of the field (the physical area covered by a source with a 
finite apparent angular size at a given distance is proportional to 
the distance squared). As such the prior takes the following form: 



P(Mi) oc Mr 11 
P(%i) °= const. 
P([Fe/H]i) oc -0.07P, 

P(Vi)<xp(Mi,k,bi)d? 



(21) 
(22) 
(23) 
(24) 



(15) P( Xi ) = P{<Mi,i u [Fe/H]i,//) = P(^)P(xi)P([Fe/H| j )/ , («) (25) 



Where Rj is the Galactocentric radius in kpc implied by m and the 
Galactic coordinates of the object and d/ is the distance implied by 

m. 

p(/Ji,li,bj) is the stellar density at a distance modulus of pi 
in the direction of Galactic coordinates (/,■,£>;)• The form of this 
can be set as desired. H-MEAD, as employed in section[4] models 
p(/Ji,li,bj) with a thin disc only, as the contribution of ot her compo- 
nents will be negligible for the cases studied. Following lSale et al] 
fcoich the thin disc is modelled as an exponential truncated disc 
with an inner scale length of 3000 pc, a truncation radius of 13 kpc 
and an outer scale length of 12 00 pc. Additiona lly a scale height 
of 300 pc is assumed (following I Juric et al.l l2008). As discussed in 
section [5] a substantially more complicated model for p(/j,, /,-,&,), 
featuring a bulge, thick disc and halo as well as warp and flare in 
the discs could be employed if desired and it is possible that obser- 
vations could be used to constrain p(/J;, h,bi). 

Two powers in the dj term in the prior on distance are a result 
of the fact that all sources subtend non-zero solid angle and that 
the area covered by this solid angle is proportional to df. The final 
factor of d/ arises in the Jacobian when converting the prior from 
one on distance to one on distance modulus. 

The Bayesian model de scribed so far is similar that of equation 
6 of iBumett & Binnevl fcOlOh. However, there are two key differ- 
ences :|BuniItL&jiIimiy. 1 2010h employ the uncertainties estimated 
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Figure 2. Similar to Fig.[T] though now the model includes extinction. 



from the observations and then require an extra term to relate these 
to the uncertainties that would be implied by a given jc,-. Whereas, 
here uncertainties are estimated directly from a;, . They also include 
a separate term which covers the probability of the star being in- 
cluded in the sample, given some x,-. It should be noted that this 
is included in the likelihood, thus reconciling their equation 6 and 
Bayes' equation. 



iGoodman. Pineda & Schneel J2009I) . iFroebrich & Rowlesl d201 d) 
and iKainulainen et al. 1 20 1 lb have observed that the lognormal 
form of the column density of ISM components appears to be valid 
empirically in interstellar clouds. A lognormal distribution can be 
intuitively seen to be preferable to a Gaussian distribution, as it is 
only denned for positive values of extinction, i.e. it ensures that 
there is no probability of the A, being less than or equal to zero. 

Furthermore, let us describe the distance-extinction relation- 
ship A = {A(d),a^(d)} with piecewise constant function (some- 
times called a step or staircase function) for both A(d) and GA{d). 

Therefore, at a given distance, one can say: 

1 .(logA,--^,-)) 2 



= exp( 



2v(di] 



-)p(wM z 



(27) 



A,V27TD(d/) 2 

Where and X>(d) describe a lognormal distribution with mean 
A(d) and standard deviation O^d). 

The definition of the overall prior is similar to equation[25l 



P{x i )=P(M i ,% i ,{Ve/n] i ,Li i ,A i ) 



:P(!W;-)/ , (x i -)/'([Fe/H] / )/'(A i , w ) 
(28) 



2.2 Interstellar extinction, with a known distance-extinction 
relationship 

The Bayesian model is now extended, by also considering the pos- 
sibility of interstellar extinction. To do so, x, is altered, such that: 

^{^Ti.pe/H]/,/^ Ail- 
Note that Aj is monochromatic extinction at a given wave- 
length. Monochromatic extinction is employed in this study as all 
broadb and measures of extinction are d ependant on the SED of the 
source jMcCallll2004l ;l Bailer- Jonesll201 if) . As a result, two sources 
sitting behind the same dust column, but with different SEDs will 
demonstrate differing Ays (for example), but necessarily identical 
values of A,-. From this point on, the word 'extinction' and symbol 
Aj should both be taken to refer to monochromatic extinction. 

The equation for the posterior probability distribution (equa- 
tion^ remains the same, though the definition of Xj has altered. An 
updated graphical description is depicted in Fig. [2] 

The definition of the likelihood given by equation [20] also re- 
mains unchanged. However, the actual apparent magnitude of a star 
,), given stellar parameters is now: 



U) 



-U) 



M(%,Zi, [Fe/H];) +fn +0) R(xi)Ai 



(26) 



Where ^R(x,) =W A(x,)/A; and is the ratio of broadband ex- 
tinction in the filter of observation j and the monochromatic ex- 
tinction for_a_s^a£_wjth_p_arametersx i . Here this ratio is found us- 
ing the Fitzpatrick & Massa (2007) R = 3.1 reddening law and 
Mun ari et alj d2005t> model spectra. 

But what of the prior? Let us say that we know a priori 
the distance-extinction relationship for the field. As all catalogues 
will inevitably cover some non-zero solid angle, at any given dis- 
tance there will be a range of possible values of extinction, as the 
dust column will vary with sky position. This is the effect that 
causes the well known problem of differential extinction in clus- 
ters, where individual stars within a cluster are subject to differ- 
ent extinctions, blurring the appearance of the cluster on colour- 
magnitude diagrams. We model the differential extinction with a 
lognormal distribution: at a distance d, the distribution of extinc- 
tion is given by a logn ormal distribution with m ean A(d) and stan- 
dard deviation 04 (d). Fisc hera & Dopital d2004) show, with simu- 
lations, that the distribution of extinction to a given distance, with 
varying angular position, takes a nearly lognormal form, whilst 



2.3 The case of an unknown distance-extinction relationship: 
3D extinction mapping 

In practice we do not know the distance-extinction relationship a 
priori, in fact we would like to determine it. To this end, we further 
extend our model to become a hierarchical Bayesian model. Under 
this description, we allow the parameters contained in A, which 
describe the distance-extinction relationship to vary, referring to 
them as hyperparameters. This makes it possible to constrain the 
distance-extinction relationship, in addition to the stellar parame- 
ters. If only the observations of one star are employed, very little 
will be learnt of the distance-extinction relationship. However, as 
the distance-extinction relationship applies to many stars within a 
field, it can be more precisely determined when using the obser- 
vations of many stars within the field. Therefore, the tightest con- 
straints on A are found when using observations of as many stars 
as is feasible. 

Fig.[3]shows a graphical description of the hierarchical model 
that is used in H-MEAD. As can be seen, the stellar parameters, 
Xj, of each star in a field are dependant on the distance-extinction 
relationship that they sample. In practice, this dependence is what 
allows the distance-extinction relationship to be found. 

In the previous sections it has been possible to consider each 
star separately, as the parameters of one star and those of another 
were mutually independent. However, now the model includes 
many stars from within a field. Therefore, for increased clarity, 
three additional sets are defined, x — {x\ ,xi ,...}, y = {y 1 ,5*2, ■•■} 
and S = {S\,S2, ■•■}• These sets contain the stellar parameters of all 
the stars, the observations of all the stars and the set of the events 
that each star is included in the sample respectively. 

Given the model depicted in Fig. [3] and remembering equa- 
tion|7] the global posterior distribution is now defined as: 

P(x,A\y,S) - P(y,S\x)P(x\A)P(A) (29) 

The likelihood and prior are each simply the product of the likeli- 
hood and prior respectively for all stars: 



P(y,S\x)=]JP($ i ,S i \x i ) 

i 

P(x\A)=HP( Xi \A) 



(30) 
(31) 
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Figure 3. A directed acyclic graph depicting of a hierarchical model that 
combines the distance extinction relationship and stellar parameters. We 
possess observations y of some stars. These observations are dependant on 
the physical parameters of the stars, x, which are themselves dependant on 
the distance-extinction relationship A. 

It now becomes necessary to set a hyperprior - a prior on 
the hyperparameters, P(A). We know that mean extinction (A(d)) 
must increase with distance. Therefore, this requirement is in- 
cluded in the hyperprior, by setting the hyperprior probability of 
any distance-extinction relationships that do not satisfy this re- 
quirement to zero. 

It was found experimentally that without any further informa- 
tion in the hyperprior, i.e. assuming an otherwise uniform hyper- 
prior, the posterior distribution would not be well behaved. In par- 
ticular, that A(d) would diverge to infinity at large distances. This 
is a result of a common feature of hierarchical models that, in or- 
der for the posterior to be well behaved, the hyperprior should be 
proper, that is that the integral of the hyperprior across all hyper- 
parameter space should be finite. The hyperprior described in the 
previous paragraph does not satisfy this requirement. 

Therefore, an improved hyperprior was required. As with the 
priors, the hyperprior is based on existing physical knowledge. 
Specifically, it is assumed that extinction tra ces an exponent i al disc , 
with scale heights and lengths taken from Marshall et aD 12006). 
The requirement that mean extinction must increase with distance 
is also retained in the hyperprior. 

2.4 The hierarchical model in practice 

Unfortunately, there exists no analytical solution for the hierar- 
chical model, as given in equation [29] Similarly, a 'brute force' 
method, which covers the entire parameter space is clearly unfea- 
sible given the very large number of parameters included in the 
model. Therefore, a Markov-Chain Monte Carlo (MCMC) method 
is employed in H-MEAD to estimate the posterior distribution and 
so, values of each of our parameters. 

As a result of the high dimensionality o f this problem, a stan- 
dard Metropolis-Hastings MCMC algorithm ( Hastings 1970) is not 
feasible: to maintain a reasonable acceptance rate the proposal dis- 
tribution would have to be so narrow that it would take an imprac- 
tically large number of iterations for th e chain to conv erge. Instead 
a Metropolis-within-Gibbs algorithm iTierney|[T994h is adopted. 



The schema for this is that during each iteration of the chain, each 
Xi is updated in turn, using a Metropolis-Hastings sampler, then 
the distance-extinction relationship A is updated, again using a 
Metropolis-Hastings sampler. It is worth noting that such 'block- 
updati ng' algorithms were originally discussed bv lMetropolis et alj 
CHI) and therefore such an approach should not be viewed as be- 
ing particularly radical or exotic. 

The feasibility of the Metropolis-within-Gibbs approach is re- 
liant on the fact that each star's contribution to the posterior proba- 
bility is dependant only on the x, corresponding to it as well as the 
distance-extinction relationship. It is independent of the rest of x, 
as visualised in Fig[3] Thus, when updating a star's parameters only 
its contribution to the posterior probability need be recomputed. It 
is only when updating A that the contribution of all stars to the 
posterior need be recomputed and even then it is only necessary to 
recalculate P(x\A)P(A). By adopting this method it is also possible 
to maintain both a reasonable acceptance rate of updates and a suf- 
ficiently wide proposal distribution. As a result, the chain quickly 
converges on the posterior distribution and subsequently samples 
from it. 

The exact performance of any MCMC based algorithm is 
partly dependant on a number of parameters that describe how it 
behaves, including the proposal distributions used and the number 
of iterations the algorithm is allowed to run for. The width of the 
proposal distributions of the stellar parameters is based on the ap- 
parent magnitude of the star: brighter stars being given narrower 
proposal distributions to reflect the fact that their parameters can be 
more precisely determined. A 'burn in' period of 100000 iterations 
is assumed; the state of the chain in the first 100000 iterations is not 
used to estimate the posterior distribution as it is assumed that the 
state of the chain in these iterations may be affected by the initial 
values of the parameters. H-MEAD is then allowed to run for a fur- 
ther 150000 iterations, making a total of 250000. The large number 
of iterations, coupled with the high dimensionality of the parameter 
space, means it is unfeasible to store the state of the chain at every 
iteration, instead it is 'thinned' by only retaining the state at every 
100 th iteration. 

Up until this point, Xj has been defined as 
{5W/,t/, [Fe/Hj/.jU^A,-}. However, in practice, it is not possi- 
ble to use this parameter space as it is extremely difficult for a 
star's Xi to transition from the main sequence to the giant branch: 
to do so would require the chain to follow a very particular path 
in {5W/,T,} space, as such it would take a prohibitively long time 
for the chain to converge. Therefore, the stellar parameters for 
each star are parametrized as Jt; = {7] ett , logg;, [Fe/H];, /i^A,-}. 
Everything stated so far in this paper still stands, with the exception 
that it is necessary to transform the prior probability distribution 
from {fMf,%i, [Fe/H],-,ft-,Ai} space to {T? e ,lo Sgi , [Fe/H];,^;} 
space, using the Jacobian determinant. 

No mention has yet been made of the effect of stell ar binarity 
or higher order multiplicity. Following ISale et al] d2009t) . a simple 
correction was made to the estimated distances to all stars such that 
they are 4% further than would otherwise be calculated. 

One could marginalise over x, so that only the distance- 
extinction relationship and not any of the stellar parameters are 
recovered. In practice though there is little to be gained directly 
from such an approach, marginalising over the stellar parameters 
in an MCMC algorithm is no quicker than constraining these pa- 
rameters. A performance in crease can only be ob tained by binning 
the data in some form (as in lMarshall e7aU2 006), which induces a 
loss of precision. 
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Figure 4. The posterior probability distributions of distance and extinction 
for an AO star at 2 kpc in three different filter systems. Top VISTA, middle 
SDSS and bottom IPHAS. 



3 OBSERVATIONS 

This paper concentrates on the use of multiband photometry. Par- 
ticular reference has been made to the use of observations from 
IPHAS and VVV, two surveys of the Galactic Plane, where the 
need for extinction mapping is most acute. It is, though, obvious 
that the method could be employed with any photometric survey. 
The hierarchical model employed remains the same, but suitable 
isochrones must be obtained and the response to extinction in all 
filters computed. 

It is, however, important to note that not all observations are 
equally informative. To demonstrate this photometry was simulated 
for an AO star on the ZAMS, located at a distance of 2 kpc, in the 
VISTA, IPHAS and SDSS filter systems. This simulated star is as- 
sumed to lie towards the anticentre and has been subjected to an 
extinction of A6250 = 2, that is an extinction equiv alent to 2 mag at 
6250 A, assuming the|? itzpatrick & Massal J2007t) R = 3.1 extinc- 
tion law. This simulated star was then analysed using the method 
of section [2721 In the first instance a uniform and therefore unin- 
formative prior on extinction was assumed. This is a simple case 
and is analogous to extinction mapping methods which do not use 
the distance-extinction relationship to refine the estimates of stellar 
parameters. The subsequent posterior probability distributions for 
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Figure 5. A comparison between the posterior distribution for VVV data in 
Fig.|4](black solid line) and the distribution of the true distances of simu- 
lated stars with apparent magnitudes consistent with that of the AO star (red 
dashed line). The posterior distribution for IPHAS data is depicted with the 
blue dotted line for comparison. 



the distance and extinction of the stars, marginalised over all other 
variables, are shown in Fig.|4]and summarised in tableQ] 

Although all three systems obtain reasonable estimates of the 
star's distance and extinction, the IPHAS data allows the posterior 
to be significantly more constrained and thus, demonstrably, carries 
more information. The particular cause for this is due to the design 
of the filter sets, the SDSS and VISTA systems are more affected by 
what is often referred to as a degeneracy between estimated spec- 
tral type (and thus equivalently estimates of T e ff or mass) and es- 
timated extinction. Due to the nature of the IPHAS filter system, 
which allows (/ — Ha) to be a rough proxy for spectral type, this 
degeneracy is less dramatic in the estimate derived from IPHAS 
photometry. 

In this case, it would appear that all three systems produce a 
biased estimate of distance. However, this arises as only one star 
is being considered and it has been chosen to be on the ZAMS 
and will therefore be intrinsically faint for a star of its colour. It 
is of course possible to have an intrinsically brighter star of the 
same colour, say one that is turning off the main sequence. Such 
stars will be less common, but in order for them to have the same 
apparent magnitude as the hypothetical AO ZAMS star at 2 kpc, 
they will have to be more distant. The possibility of the observation 
being of such a star elongates the posterior distribution. In effect, 
the estimate of the star's absolute magnitude or logg is not well 
constrained. When a whole population of stars is considered, on 
average this effect disappears as the real distribution of these stars 
will resemble the posterior. 

To verify this we turn to a simulation of VVV photometry, 
of the type discussed in section [4] though with extinctions of all 
amounts assumed equally likely, was performed for a region of 200 
square degrees. A sample was then populated with all the stars with 
apparent magnitudes in all three bands within 0.02 that of the AO- 
star. Fig. [5] shows a normalised histogram of the true distance of 
the stars in this sample compared to the posterior distribution for 
the A-star obtained using simulated VVV data marginalised over 
all parameters except distance. Fig.[5]reveals that the posteriors in 
Fig. [4] are not biased, but are an honest reflection of what can be 
determined about the star using the data at hand. 

H-MEAD, however, does not assume a uniform prior on ex- 
tinction, but rather, as indicated by equation(29] uses the estimated 
distance-extinction relationship. To illustrate the effect of assuming 
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Table 1. A summary of the histograms in Figs. |4]|6] and [8] Columns show the mean of the posterior distribution, and a 68% credible interval centred on the 
mean. 



Without Gaia, uninformative prior on A, d Realistic prior on A, d With Gaia 

Data Mean CI Mean CI Mean CI 
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Figure 6. Similar to Fig. [4] though now the depicted (red line) distance- 
extinction relationship has been employed to set the prior probability. Dif- 
ferential extinction has been set by assuming o"a(^) =A(d)/5 



a more informative prior on extinction, the analysis that produced 
Fig.[4]was repeated, though now with a typical distance-extinction 
relationship used to set the prior. The posteriors and the assumed 
distance-extinction relationship are depicted in Fig.[6]and the pos- 
teriors summarised in tableQ] The resultant priors are significantly 
more precise, demonstrating the value of the hierarchical approach 
pursued by H-MEAD. In particular, the extra information from the 
prior now allows logg to be more accurately determined, as can be 
seen in Fig. [7] 

The method described here can be applied to almost any form 
of astronomical observations. Although this work largely concen- 
trates on the use of photometry, it is possible to also use spec- 
troscopy or spectroscopically derived measurements, for example 




Figure 7. A comparison between the marginalised posterior distributions on 
log g determined with an uninformative uniform prior (solid black line) and 
a more informative prior (dashed red line) with SDSS data, and with com- 
bined SDSS and Gaia data (blue dots). The surface gravity of the simulated 
star is logg = 4.28 



one could use the spectra of SEGUE or RAVE and/or the esti- 
mated stellar parameters derived from them. Though this comes 
with the caveat, as mentioned in section 12.1.11 that care must be 
taken with the likelihood when multiple measurements, for exam- 
ple those produ ced by the Segue Stellar Parameter Pipeline (SSPP 
iLee et alj |2008). are derived from the same spectra. Additionally, 
unlike photometric surveys which blindly observe all visible ob- 
jects within a field of view, targets for spectroscopic surveys are 
preselected, with preference being given to certain types of object. 
Therefore, it is necessary to understand this selection function and 
include it in the likelihood. 

It is also possible to use astrometric observations, parallaxes 
being of particular interest in this context. Though they may appear 
to only constrain stellar distances directly, as estimates of other pa- 
rameters will be correlated with the estimated distance, they too 
will be constrained by parallax measurements. As an example, 
Fig [D returns to the example of an AO star at 2 kpc, subject to an 
extinction of A6250 = 2. The esti mated uncer t ainty on the Gaia par- 
allax has been derived following lOndeg ren (2009). The inclusion 
of a Gaia like parallax not only allows the distance estimate to be 
much more tightly constrained, but also improves the precision of 
the extinction estimate substantially. Furthermore, as the distance is 
more tightly constrained, so too are estimates of the star's absolute 
magnitude and logg. 

Clearly, the ideal approach would be to use all data available 
and so obtain the most precise results. In practice though, there is a 
cost in both CPU time and human effort associated with the use of 
any individual survey dataset, stemming from the need to configure 
isochrones and the response to extinction and process the observa- 
tions in H-MEAD. A sensible approach would be to prioritise the 
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Figure 8. Similar to Fig. [4] However, now a simulated Gaia parallax has 
been employed in addition to the photometry. 



use of more informative surveys over those which will be of less 
use. 
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Figure 9. The solid red lines show the extinction distance relationships 
assumed in the production of synthetic photometry. The black error bars 
show the extinction distance relationship obtained by H-MEAD when run 
on the synthetic photometry. The sightlines shown are, from top to bottom: 
(l.b) = (90,2), (135,2), (180,0). The number of stars in the simulated cat- 
alogues are 1766 for (90,2), 1246 for (135,2) and 977 for (180,0). 




4 VERIFICATION 

It is necessary to assess the veracity of the H-MEAD algorithm, to 
measure its precision and demonstrate its accuracy. This has been 
done through the use of simulated photometry, as it is possible to 
know the exact distance-extinction relationship and the stellar pa- 
rameters of the stars in the simul ation. This process is similar to 
that described in lSale et alj J2009|>. The Galactic model employed 
is largely identical to that used in lSale et alj ( |2010|) . which itself fol- 
lows in the spirit of the Besancon model. In broad terms, the model 
seeks to simulate photometry for an observer in a galaxy similar to 
the Milky Way. The model includes features such as photometric 
errors, binarity, metallicity variation and extinction. 

As a first test IPHAS photometry was simulated for stars using 
a Galactic model that features no stellar binarity and a fixe d metal- 
licity at solar levels. Subsequently, as in lSale et al.1 ( 120090 . colour 
cuts were applied to the simulated data to exclude stars with colours 
consistent with having a spectral type later than K4. A number of 
Galactic plane sightlines were examined (/ = 45,90, 135, 180 and 
b = 0,2, 5) with the expectation that these should be broadly repre- 



sentative of the different regimes encountered in the Galactic plane. 
All simulated sightlines had an angular size of 5' x 5', a bright pho- 
tometric limit at r' = 13 and a faint limit at r' ~ 20.5. In all cases 
H-MEAD successfully retrieved the input extinction distance rela- 
tionships, examples can be seen in Fig|9] 

In Fig. [9] and subsequent similar figures, error bars are em- 
ployed to depict 68% credible intervals on the mean extinction in 
each distance bin. A credible interval contains a given proportion of 
the posterior probability distribution on one parameter, marginalis- 
ing over all other parameters, and can be viewed as being an ap- 
proximate Bayesian equivalent to a confidence interval. In this in- 
stance the credible intervals have been estimated from the chains 
created by the MCMC process. Credible intervals, like confidence 
intervals, could be placed anywhere in the parameter's range as 
long as they contain the desired proportion of the posterior prob- 
ability. In H-MEAD and this paper, for the sake of simplicity and 
consistency, all credible intervals are centred on the mean value and 
are symmetric. Note that these error bars do not show differential 
extinction, CT^(d). 

To verify that the means and credible intervals are an ade- 
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Figure 10. The estimated and true values of A(4kpc) (solid black line), 
A(5kpc) (dashed red line) and A(6kpc) (short-dashed blue line) for 200 dif- 
ferent runs of H-MEAD, using different photometric catalogues. The typi- 
cal half-width of a 68% credible interval on A is ~ 0.05 at 4 kpc and ~ 0.08 
at 5 and 6 kpc. 



Figure 12. Demonstrating the effect of neglecting metallicity variation in 
stars. The solid line shows the distance-extinction relationship employed to 
simulate photometry. The crosses show the distance-extinction relationship 
obtained by H-MEAD, if it assumes all stars have solar metallicity, whilst 
the circles show that obtained if metallicities are allowed to vary. 
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Figure 11. The estimated value of A(5kpc) for 200 different runs of H- 
MEAD, using the same simulated photometric catalogue. The standard de- 
viation of this distribution is 0.009. For comparison, the 68% credible inter- 
val on A(5kpc) in each of the 200 runs is typically 0.08. 

quate summary of the posterior on mean extinction, the results of 
repeated use of H-MEAD were examined. To this end, 200 differ- 
ent synthetic photometric catalogues were simulated for the same 
sightline, (l,b) = (180,0), H-MEAD was subsequently run on each 
catalogue and the results analysed. Fig[l0]shows the distribution of 
the means of the posterior on mean extinction at three different dis- 
tances and demonstrates that the means and credible interval give a 
reasonable description of the posterior. 

One key feature of the extinction curves in Fig. [9]that should 
be noted is that values of mean extinction in successive bins are 
highly correlated. As a result, it may appear that the credible in- 
tervals for mean extinction, displayed as error bars, are underesti- 
mated in regions. For example at a distance of ~ 2.5 kpc along the 
(180, 0) sightline shown in Fig. [9] where the estimated extinction is 
lower than the true value. However, in reality, as the mean value of 
extinction in successive bins is so strongly correlated the probabil- 
ity of successive bins lying roughly one credible interval below the 
actual value is only slightly less than that of one bin being similarly 
placed. 

Using the simulated photometry is was also possible to ver- 
ify that, in all cases, H-MEAD was able to converge on a result 
quickly, typically within the first 10,000 iterations; allowing for 
a burn in period of 100,000 iterations was found to be somewhat 



conservative. As MCMC algorithms sample from the posterior dis- 
tribution, quantities derived from the MCMC chain are subject to a 
standard error. In addition samples drawn from an MCMC chain are 
not independent, thus the effective size of such a sample is smaller 
than its absolute size. As a result of the high dimensionality of the 
statistical model employed in H-MEAD, the MCMC chains mix 
somewhat slowly. This, in turn, reduces the effective sample size of 
the chain and thus increases the standard error on the estimates of 
the distance extinction relationship and stellar parameters. For this 
reason, the long chains are necessary. In the tests conducted effec- 
tive sample sizes were typically a few hundred. In comparison, the 
thinned post-burn-in chain contains 1500 iterations and is itself ob- 
tained from the unthinned post-burn in chain of 150000 iterations. 
Fig. QT] demonstrates that the standard error on the estimated pa- 
rameters is acceptably small compared to the width of the posterior 
distribution, thus it can be neglected. 

Colours in the IPHAS system are largely independent of 
metallicity, mag nitudes though a re somewhat affected by metal- 
licity variations dSale et al] f2009). As such, the distances to stars 
estimated with IPHAS photometry are dependant on the contents 
of the prior. If photometry wi th colours sensitiv e to metallicity, e.g. 
the («' - g') colour in SDSS dlvezic et alj2 008). were used instead 
the metallicity of the stars would be tightly constrained by the data 
and less dependant on the metallicity gradient in the thin disc as- 
sumed in the prior. 

The importance of the assumed thin disc metallicity gradient is 
demonstrated in Fig.Q/2] IPHAS photometry was simul ated, assum- 
ing a metallicity gradient of —0.07 kpc -1 (following Rob in et aT] 
2003). If H-MEAD assumes this same metallicity distribution in 
the prior, it accurately retrieves the distance-extinction relation- 
ship. If, however, it assumes that all stars are of solar metallicity, 
it overestimates distances significantly. Smaller changes impact on 
the results far less significantly, for example assuming a flattening 
of t he metallicity gradient at large Galactoc entric radii (following 
e.g. lCarraro et alJl200llBragagha et alj|2008l) . 

H-MEAD also estimates the stellar parameters for every star 
in the sample. The precision with which it is able to do so is in- 
evitably dependant on the apparent magnitude of stars. However, it 
is also affected by the characteristics of the sightline: it is possible 
to estimate the distance of a star in a region densely filled with dust, 
where extinction is building up rapidly, more accurately than that 
of a star in a region where the dust content is low. 
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Figure 13. Comparing the performance of MEAD and H-MEAD. 
Again, the sightlines shown are, from top to bottom: (l,b) = 
(90,2), (135,2), (180,0). The number of stars in the simulated catalogues 
are 1692 for (90,2), 1257 for (135,2) and 952 for (180,0). The results ob- 
tained by H-MEAD are shown with black error bars and those of MEAD 
with broader blue bars, with a circle at the mean value. 

Finally a direct compa rison is made b etween the performance 
of H-MEAD and MEAD dSale et alj|2009l) . Both algorithms were 
run on the same simulated photometry, some examples are shown in 
Fig-E] In all cases H-MEAD produces clearly far superior results. 

H-MEAD has already been used with real data. Raddi et al. (in 
prep.) applied H-MEAD to IPHAS photometry along lines of sight 
towards 68 Classical Be stars. Armed with the produced distance- 
extinction relationships and sp ectroscopically estimated redden- 
ings, 'extinction distances' (e.g. iGiammanco et al.ll201 ll) are esti- 
mated for each star. These are then compared to distances estimated 
purely from photometry and so the comparison between the two 
constitutes a strong test of the capabilities of H-MEAD. 



5 FURTHER EXTENSIONS TO THE MODEL 

In the hierarchical model employed by H-MEAD, as specified by 
equation [29] and Fig. [3] there are a number of fixed parameters. 
These are found in the prior probability distribution and include 
the shape of the IMF, the stellar density distribution, the Galactic 
SFH and how metallicity varies with position in the Galaxy. Poor 



choice of these priors will lead to systematic error in the estimates 
obtained by H-MEAD. 

Although it has not yet been done, it is in theory possible to 
further extend the model such that all these parameters are them- 
selves allowed to vary and thus we gain additional hyperparame- 
ters. As an example, we could allow the exponent in the assumed 
IMF to vary. Extending our model in this way would, in the worst 
case scenario, allow for marginalisation over different IMFs. How- 
ever, if the data to hand are suitably informative, it would be pos- 
sible to more tightly constrain the shape of the IMF. By perform- 
ing this form of analysis, the possibility of systematic error will be 
much reduced. 

In the case of stellar density, one could employ a more compli- 
cated model. The prior employed in section [4] only includes a sim- 
ple exponential thin disc when describing stellar densities. This is a 
good model at low Galactic latitudes, but will be increasingly insuf- 
ficient with increasing distance from the Galactic plane. One could 
therefore employ a model which also includes thick disc, bulge and 
halo components. Additionally, it would also be possible to include 
warping and flaring of the thin disc in the model. Again the param- 
eters that describe the warp and flare could be allowed to vary and 
so, the observational data could constrain the form of the thin disc's 
warp and flare. 

T he model cur r ently employed as sumes an R = 3. 1 reddening 
law o f iFitzpatrickl d2004l) . following ICardelli, Clayton & M athis 
(1989). Clearly, this is a somewhat simplistic approach, given that 
it is well known that the extinction law varies across the sky. An 
alternative approach might be to allow the extinction law to vary 
between fields. However, this too can been seen to be simplistic: if 
the extinction law is allowed to vary between adjacent fields, why 
shouldn't it vary within a field? A far more comprehensive would 
be to model R in the same way as extinction is modelled in this 
paper. Namely, that within a field, at a given distance, R has some 
distribution with non-zero width and this distribution is allowed 
to vary with distance. Again this approach requires a hierarchical 
model, each star has its own value of R, which samples the distri- 
bution of R along the line of sight. 

The value of this approach can be most easily seen with a sim- 
ple example: Imagine that within some field the first few kpc are 
dominated by 'normal' R = 3.1 dust, then at a distance of a few 
kpc there exists a dark cloud, containing high R dust covering part 
of the field. Behind the dark cloud R = 3.1 dust is again dominant. 
Assuming R = 3. 1 would clearly be unsatisfactory for stars behind 
the dark cloud, whilst taking some field averaged value of R would 
not be representative of either those stars obscured by the cloud or 
those not. In this case allowing R to have a distribution with non- 
zero width and allowing this distribution to change with distance 
would capture the existence of the cloud at some depth along the 
line of sight and cope with the variation of extinction law with an- 
gular position, caused by the cloud not covering the entire field. 

In addition, by not treating stars individually, it is possible 
to take advantage of the fact the stars located near each other 
should exhibit correlated values of R as they lie behind similar dust 
columns. As a result, the estimated values of R for each star and 
for the field in general will be more precise than if the stars were 
treated individually, in much the same way as the hierarchical ap- 
proach allows extinction to be measured more precisely. 

An example of a model which could be employed to study 
Galactic structure as well as map extinction in three dimensions 
is shown in Fig. [14] This model is somewhat more complicated 
than that employed by H-MEAD and as such will require superior 
data to constrain the additional parameters. Additionally, it is likely 
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Figure 14. A graphical description of an extended hierarchical model which 
can also be used to study Galactic stellar density (p) and 3D reddening law 
variation (R). 



that its implementation will present a number of additional practical 
hurdles, akin to those discussed in section[2l4l 



6 CLOSING DISCUSSION 

Given the high frequency of hierarchies in Galactic astronomy, hi- 
erarchical Bayesian models will be a key tool for analysing the ever 
growing quantity of data on our Galaxy. The advantage of using hi- 
erarchical models is clear. Such models relate the parameters of 
many stars, through the characteristics of the Galaxy or the region 
in which they are found. Therefore, they provide additional infor- 
mation and allow the parameters which describe stars and those 
which describe the Galaxy to be determined more precisely and ac- 
curately. Conversely, methods which consider stars individually do 
not take advantage of the relationship between stars and the Galaxy 
they populate and cannot take full advantage of the data they em- 
ploy. 

Furthermore, employing a Bayesian methodology offers sev- 
eral advantages over classical frequentist techniques. Bayesian 
techniques avoid the classic inverse problem, whereby the pa- 
rameters of interest are determined directly from the data, by in- 
stead finding the model which best describes the data. Such in- 
version can be particularly difficult when there are many param- 
eters involved and their estimates are correlated. Additionally, with 
Bayesian techniques one is able to include prior information, in this 
case the basic physics of extinction and the accumulated historical 
knowledge of the Galaxy can both be included to give more precise 
and accurate results. 

This paper has examined one potential use of hierarchical 
Bayesian models, employing them to map extinction in three di- 
mensions. A good knowledge of where and how extinction builds 
up in the Galaxy is a major barrier to furthering our understand- 
ing of the structure an history of the Galaxy. There are few ex- 
isting 3D extinction maps and those that do exist are limited by 
one or more of several factors, including the data they employ (e.g. 
iNeckel & Klare fl980) a reliance on a particular Galactic model 
i Marshall et alj 2006th by only being able to achieve a coarse dis- 



tance resolution (e.g. iMaiewski. Zasowski & Nidevei 2011 ) or by 
only treating individual stars in isolation (e.g. Berry et alj|201ll) . 

A particular algorithm, H-MEAD, which employs hierarchical 
Bayesian models has been described in detail. The model employed 
has been discussed, concentrating on the physical justification for 
its use, but also covering some of the practical aspects of its im- 
plementation. Subsequently, testing with synthetic photometry has 
demonstrated that H-MEAD does indeed produce results which are 
both accurate and precise. The pr ecision of H-MEAD is massively 
improved with respect to MEAD dSale et alj [2009). whilst the fact 
that H-MEAD is able to cope with features such as variation in stel- 
lar densities and metallicities will considerably reduce systematic 
errors. 

It should also be remembered that H-MEAD not only maps 
extinction, but also provides precise estimates of the distances, 
masses, metallicities, etc of stars. There are many potential uses 
for such data, one possibility is studying the distribution of high 
mass stars in order to map the Galaxy's spiral structure. 

In the longer term, it is not possible to consider the future use- 
fulness of the method described in this paper and hierarchical mod- 
els in general without mentioning future surveys. Gaia is expected 
to provide parallaxes for ~ 10 9 stars as well as spectroscopy for 
a subset. As such, it will clearly be central to any future analysis 
of the Galaxy. However, it will be far from the only data available. 
Rather than simply using Gaia data alone, a more precise analysis 
will be possible if data from existing and forthcoming surveys, be 
they photometric or spectroscopic, were used as well. 

However, a major problem is this wealth of data should be 
analysed. Clearly, concealed within the data will be a gold-mine of 
information on our Galaxy and, in theory, it should prove possible 
to make spectacular advances in understanding the current and past 
state of the Galaxy. This massive advantage though becomes a ma- 
jor hurdle, in so far as the sheer quantity and complexity of the data 
will considerably complicate its analysis. 

The method discussed in this paper, could be considered to be 
a prototypical example of how Gaia data might best be analysed. 
There are many potential improvements to the method, several of 
which have been discussed in section [5] which will not only in- 
crease its accuracy but also extend the scope of what it can achieve. 
If applied this would make it possible to not only map extinction 
in 3D, but also map Galactic structure, estimate the Galaxy's star 
formation history and more besides. 
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